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ABSTRACT 

Astronomical imaging using aperture synthesis telescopes requires deconvolution of the point spread function as well as calibration 
of instrumental and atmospheric elfects. In general, such effects are time-variable and vary across the field of view as well, resulting 
in direction-dependent (DD), time-varying gains. Most existing imaging and calibration algorithms assume that the corruptions are 
direction independent, preventing even moderate dynamic range full-beam, full-Stokes imaging. We present a general framework for 
imaging algorithms which incorporate DD errors. We describe as well an iterative deconvolution algorithm that corrects known DD 
errors due to the antenna power patterns (including errors due to the antenna polarization response) as well as pointing errors for high 
dynamic range full-beam polarimetric imaging. Using simulations we demonstrate that errors due to realistic primary beams as well 
as antenna pointing errors will limit the dynamic range of upcoming higher sensitivity instruments like the EVLA and ALMA and 
that our new algorithm can be used to correct for such errors. We show that the technique described here corrects for effects that can 
be described as approximate unitary operators in the interferometric measurement equation, such as those due to antenna pointing 
errors and non-azimuthally symmetric antenna power patterns. We have applied this algorithm to VLA 1.4 GHz observations of a 
field that contains two "4C" sources and have obtained Stokes I and V images with systematic errors that are one order of magnitude 
lower than those obtained with conventional imaging tools. Residual systematic errors that are seen at a level slightly above that of the 
thermal noise are likely due to selfcalibration instabilities that are triggered by a combination of unknown pointing errors and errors 
in our assumption of the shape of the primary beam of each antenna. We hope to present a more refined algorithm to deal with the 
fully general case in due course. Our simulations show that on data with no other calibration errors, the algorithm corrects pointing 
errors as well as errors due to known asymmetries in the antenna pattern. 

Key words. Methods: data analysis - radio interferometry ~ Techniques: image processing - deconvolution 



1. Introduction 

Astronomical observations made with interferometric radio tele- 
scopes suffer from variable gain effects which can be broadly 
classified as direction-independent (DI) and direction-dependent 
(DD) errors. The complex instrumental gains due to the elec- 
tronic devices that follow the feed elements are directionally 
independent, whereas the time-varying gains due to the an- 
tenna primary beams provide an example of direction-dependent 
gains. Direction-independent effects may be corrected separately 
from imaging, so most processing algorithms have been limited 
to treating such effects. Direction-dependent gains are more dif- 
ficult to incorporate since they must be applied during imaging 
which has slowed progress thus far. This must now change, how- 
ever, because direction dependent gains are expected to limit ob- 
servations with existing as well as next generation telescopes 
presently under construction. Indeed, a number of deep observa- 
tions made with present telescopes have been limited already by 
direction-dependent, time-variable errors. In this, the first of two 
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papers, we present an algorithmic framework that allows incor- 
poration of a class of directionally dependent gain effects during 
deconvolution. A key part of this framework is the provision of 
an efficient transform between the data and image domains. As 
an example of the application of this framework, we demonstrate 
that the effects of known antenna pointing errors and beam po- 
larization can be corrected during imaging. We also discuss er- 
ror propagation and the required computing resources. A second 
paper will describe an algorithm constructed within this frame- 
work that allows solving for parameters that describe such gain 
changes. 



2. The Measurement Equation 

The measurement equation that describes astronomical imaging 
using aperture synthesis telescopes can be compactly written us- 
ing the Hamaker-Bregman-Sault notation ( |Hamaker et al.|1996] l 
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(1) 



where V is the observed full polarization visibility vector, M.. 



and M?'^^(s) are the Mueller Matrices (Mueller 
DD gains respectively, s is a direction in the 



1948|lforDIand 

sky. 



is the im- 
age and bij is the vector that describes the projected separation 
between the antennas / and j in units of the wavelength of the 
observation. 

Measurements sample this equation, providing constraints 
on the unknowns on the right hand side. The sky brightness 
distribution I(s) has to be estimated in the presence of known 
or unknown gain terms My and M?'^^. However, the measure- 
ment equation cannot simply be inverted as it is not a Fourier 
transform. Furthermore, as is the case for simpler forms of the 
measurement equation, the visibility is sampled only at a lim- 
ited set of points so there is insufficient information to deter- 
mine the solution exactly. We follow the normal terminology 
and call the estimation of the sky brightness "deconvolution" 
though it is only vaguely related to the typical deconvolution of 
a shift-invariant point-spread function ( [Andrews & Hunt 1977 ). 
The other part of imaging is the (necessary) estimation and cor- 
rection of the gains either assuming that the sky brightness dis- 
tribution is known ("calibration") or determined simultaneously 
("self-calibration"). 

Most often, deconvolution and self-calibration are performed 
using simple steepest-descent algorithms. The residuals are min- 
imized in the least square sense by minimizing which can be 
expressed in terms of the data (V) and the model data (V*') as 



X' 



(2) 



where A is the inverse of the measurement noise covariance ma- 
trix. The sky-brightness model, the unknown gain terms as well 
as suitable imaging weights are included in V'^ . Gradients of 
with respect to the sky brightness model and the gain terms 
may be calculated straightforwardly and used in an iterative min- 
imization algorithm to solve for the sky brightness model and 
unknown gains (Schwarz 1978 Bhatnagar & Cornwell 20041. 



Often, this results in complex and non-linear algorithms but 
these seem to work well most of the time. To make a practical 
algorithm, it is necessary to find efficient ways of calculating the 
gradients of with respect to the unknowns. In this paper, we 
consider the case of estimating the sky brightness in the presence 
of known directionally dependent gain terms. In a subsequent 
paper, we will consider the estimation of parameters describing 
unknown gain terms. 

There are cases in which the deconvolution and correction 
for the Mueller matrix can be decoupled. For example, direction 
dependent effects which are identical for all the measurements 
can be removed by dividing the deconvolved image by M^'^^. 
The correction for an azimuthally symmetric and time-constant 
antenna power pattern provides one such example. The deconvo- 
lution is then performed on the entire data set, ignoring the an- 
tenna power pattern, whose inverse function is applied to the im- 
age only after the deconvolution and self-calibration have been 



' All expressions in this paper are in the signal-domain polarization 
frame (the linear or circular polarization bases). Conversion to and from 
the Stokes frame can be done by the application of an appropriate co- 
ordinate transform operator ( |Hamaker et al.|1996^ . Note that only the 
minor cycle of the deconvolution iterations operate in the Stokes frame 
(see section |4]l. 



completed. However, this assumption often breaks down. For 
example, the antenna power pattern for most practical antenna 
geometries is azimuthally asymmetric (because of off-axis feed 
position, asymmetric subreflector, feed-legs, . . . ) and rotates on 
the sky for azimuth-elevation mount telescopes resulting in off- 
axis gains which vary with parallactic angle (PA). In such case, 
processing time slices of the data independently might lower 
the range of the gain variations in each subset. The final decon- 
volved images for each subset are averaged post-deconvolution. 
This is straightforward and often used, but can be expected to be 
sub-optimal because the deconvolution step is inherently non- 
linear and higher PSF sidelobes for individual subsets increase 
the level of (non-symmetric) deconvolution eiTors in each sub- 
image. Hence it would seem preferable to follow a procedure 
that applied the corrections while imaging the full data set. 



3. Example of directionally dependent gains 

In this section, we consider in detail an example of directionally 
dependent gains - the antenna far-field voltage pattern. 

The far-field voltage pattern is the Fourier transform of the 
antenna illumination function ( Kraus[ 1986). Thus it is typically 
the case that because of the details of the antenna geometry (such 
as quadrupod legs) and feed design, the antenna voltage patterns 
are azimuthally asymmetric. Furthermore, the polarization re- 
sponse of the antenna will vary away from the antenna optical 
axis due to antenna geometry and the physics of the reflection 
of electromagnetic waves from curved surfaces. In addition, as 
an interferometric array composed of altitude-elevation mounted 
antennas tracks a region of the sky, these asymmetrical antenna 
voltage patterns rotate on the sky. This, along with significant 
time varying antenna pointing errors, makes M^'^^ time varying 
and different for each antenna pair (interferometric baseline). 
Even equatoiially mounted antennas share in this problem to the 
extent that changes in elevation (temperature) might deform the 
antennas due to gravity (dilation). 



3. 1 . The Jones and Mueller matrices 

The Mueller matrix is an outer product of the two antenna 
based Jones matrices Pones|194Tj Hamaker et al.|1996l l. A full 
direction-dependent polarimetric description requires a Jones 
matrix per pixel in the image. For the two orthogonal polariza- 
tions, labeled p and q, the Sky Jones matrix as a function of 
direction is given by: 



iqp 



ipq 



(3) 



The super-scripts pq and qp represent leakage of the q- 
polarization signal to /9 -polarization signal and vice-versa. The 
diagonal elements coiTespond to the antenna voltage patterns on 
the sky while the off-diagonal elements coiTespond to the polar- 
ization leakage terms (p — > ^ and q — > p) due to instrumen- 
tal leakage (antenna geometry, electronics) and/or atmospheric, 
ionospheric or other transmission effects such as Faraday rota- 
tion. 
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The full direction-dependent Sky Mueller matrix M|j for 
baseline y is a 4 x 4 matrix: 

Mf'is) = J 



(4) 



The diagonal elements of this matrix are the antenna power pat- 
terns for the four polarization products, whereas the ofF-diagonal 
products incorporate the cross-polarization leakage terms. For 
the VLA antennas, where the two circular polarization power 
patterns are squinted with respect to each other, the power pat- 
tern for the parallel hand (J[^J[' ) and the difference between 

a parallel hand and a cross hand product (Jl^J^* - Ji'^Jj" ) are 
shown in Fig.[T](super-script R and L denotes the right- and left- 
circular polarizations respectively). The main lobe of the power 
pattern is azimuthally asymmetric and, clearly, highly asymmet- 
ric in the first sidelobe. This asymmetry is due to aperture block- 
age by the feed and the feed-legs. The two parallel-hand power 

patterns (J^J^ and diagonal terms) are also not identical 
because of differences between the power patterns for the two 
orthogonal polarizations. Rotation of these patterns on the sky 
as a function of PA leads to time varying, direction-dependent 
gains. Even differences between two diagonal terms, for exam- 
ple Ji'^Jj^ - Jj'^Jj" , are on the order of a few percent and vary 
with position within the beam. In addition, differences between 
antennas (focus, surface accuracy, pointing) will lead to second- 
order differences in the values of any given term. Therefore, an 

Skv 

assumption of diagonal form for M^j will lead to errors in the 
image domain - particularly in the presence of strong sources lo- 
cated in the outer parts of the main lobe as well as in the first few 
sidelobes. 

For high dynamic range imaging (>fewxlOOO), the off- 
diagonal terms of the Mueller matrix are non-negligible, vary 
across the entire beam and, typically, increase substantially with 
distance from the center. Figure [2] shows the figure is the cor- 
rect one typical off-diagonal terms for the VLA (the cross polar 

power patterns J[^J["" and J^'^'-Jj'^'-*). These are the higher order 
leakage terms and are purely due to the leakage of the orthogo- 
nal polarization signals into the complementary polarization in 
the signal path from various antennas The term shown in the 
left-hand-side panel is the first order leakage term and has peak 
amplitude relative to the diagonal terms of ~ 10"^. Since this 
is comparable to the difference between the diagonal terms, ig- 
noring this term will result in imaging artifacts similar to those 
due to the assumption that parallel- and cross-hand power pat- 
terns are identical. The second order term J.'^'-j!^'- , shown in the 
right hand side panel, has an amplitude of ~ 10 Consequently, 
for high dynamic range imaging (>fewxlOOO), the M?^^ can- 
not be approximated as even a diagonally dominant matrix. For 
even moderate dynamic range full-Stokes, full-beam imaging, 
this difference needs to be taken into account. 

Existing calibration procedures split the errors into two an- 
tenna based Jones matrices - the G and D matrices for complex 
gain and polarization leakage, respectively. G is assumed to be 
purely diagonal while D is unity along the diagonal and the off- 
diagonal terms are the leakage gains ( [Hamaker et al.|1996) . The 



directional dependence of these terms is ignored and the values 
of the complex gains and leakage gains at the center of the field 
are used throughout the beam. Typically, polarization leakage is 
small near the optical axis of the antenna. Hence, for imaging 
compact sources at the center of the beam, the Sky Mueller ma- 
trix is diagonally dominant and the above approximation is jus- 
tified. When imaging fields with significant emission throughout 
the primary beam, this approximation will lead to artifacts and 
significantly lower image fidelity away from the image center. 
This is even more true for the case of mosaic imaging ( Cornwell 
1988 1 where there is significant flux density throughout the pri- 



mary beam for most pointings. 

Therefore for high dynamic range full beam imaging, full 
treatment of polarization has to be kept in the entire imag- 
ing and calibration process. We refer to this as J-Matrix based 
imaging and calibration. Stokes images have to be made from 
the linear addition of the visibilities from all polarization prod- 
ucts, weighted by the appropriate terms of the Sky Mueller ma- 
trix. Strictly speaking, even conventional Stokes-I imaging using 
only the parallel hand visibilities is incorrect. For moderate dy- 
namic range full beam, full Stokes imaging, it may be possible 
to use only the diagonal terms of the Sky Mueller matrix. Note 
that the computational load of J-Matrix based imaging is a fac- 
tor of four higher than the corresponding load for conventional 
imaging (where the J-Matrix is assumed to be purely diagonal). 

4. A deconvolution algorithm incorporating 
direction-dependent gain correction 

As described above, most iterative deconvolution algorithms de- 
rive updates (A/*') to the existing model image (7*^) from the 
gradients of chi-square with respect to the unknown sky bright- 
ness' 

A/" = -Cdx^ldl^ = 0/* (5) 

where C is a scaling term, either a constant or the inverse 
Hessian (or an approximation thereof). 

Typically, the model image is iteratively improved as: 



(6) 



where is the Fourier transform of the residual visibilities (F*) 
and if is the cumulative model at the iteration. The operator 
T selects part of the gradient image. For the Hogbom Clean al- 
gorithm ( |Hogbom|1974[ l, T simply updates the model by adding 
a scaled version of the the peak in the image. 

Following terminology established by Clark (1980), the cal- 
culation of the derivative for a given estimate is called the major 
cycle, and the application of the T operator is called the minor 
cycle. The minor cycle typically operates in the Stokes frame. 
The operator T includes conversions between the signal-domain 
polarization frame and the Stokes frame using an appropriate 
coordinate transform operator ( [Hamaker et al.|1996 1. 

The major cycle can be broken into two calculations: 

forward: In the forward step, the model visibilities for baseline 
/- i are calculated from the existing model image using 



the equation: 



M 



(7) 

where the operator A is the measurement matrix, 
backward: In the backward step, the residual visibilities (V*) 
are propagated backwards to the image plane using the equa- 
tion: 

l'^ = [a' a]"' a' (8) 
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Fig. 1. The image in the left panel shows a typical (first) diagonal term for the VLA at 1.4 GHz (J^Jj' ) of the Sky Mueller matrix. The pattern is 
olf-center due to the off-axis location of the feeds. The image in the right panel shows the difference between the first and second diagonal term 
was evaluated at PA=0 and normalized by the peak of one of the diagonal terms (the peak values of both diagonal terms are 

the same). 
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Fig. 2. The off-diagonal terms of the Sky Mueller matrix for VLA antennas at 1.4 GHz. The image in the left panel is of first order in the antenna 
leakage (J." Jj"'- ). The image in the right panel is the second order term in antenna leakage (J^'^'-Jj'"- ). J®^** was evaluated at PA=0 and normalized 
by the peak of one of the diagonal terms (the peak values of both diagonal terms are the same). 



The forward calculation must be done with high accuracy, 
but since the overall approach is iterative, the backward calcu- 
lation may be performed with lower accuracy ( |Schwab||1983 
|Cotton|19 99 ) When using the FFT algorithm for computing the 
Fourier transform, the gridded visibilities F/*' are interpolated 
from a regular grid and re-sampled at the measured (m,v, w) 
points as: 

y^' («,■,-, v^j, wij) = (G [F/^]') (Uij, Vij, Wij) (9) 

where G is the interpolation operator, F is the Fourier trans- 
form operator and the superscript g indicates data on a regular 
grid. The backward calculation will correspond to the applica- 
tion of [G F] ' . The operator F is unitary. If G is at least approx- 
imately unitary, G ' can be used as the interpolation operator for 



re-sampling the data on a regular grid to correct for the effects of 
G in the image. Applying G and G ' as part of the re-sampling 
operations require that both of these operators have finite sup- 
port. In principle, any operator for which the inverse exists can 
be used, provided the inverse operator also has a finite support. 
However, since the inverse of an operator X involves det{Xy , 
it is difficult to imagine X"' with a support size comparable to 
X. An approximate inverse operator with finite support for our 
case can be constructed by using G ' for re-sampling the data 
(left-hand side of Eq. [9]l and then dividing the resulting image 
by t/er(FG). 

The similarity between Eqs.|9]and|7]indicates that direction- 
dependent gains can be incorporated as part of the deconvolu- 
tion iterations by using an efficient algorithm for the forward 
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Fig. 3. The otF-diagonal term of J^'^^* J^'^". Images in the left and right panels show the real and imaginary parts respectively. The overlayed 
contours correspond to max (jf ^"[0]) X [0, 0. 1 , 0. 15, 0. 18, 0.2, 0.4, 0.6, 0.8, 0.9]. The absolute maximum value inside the first null is ~ 0.2. 



and backward calculations. We have chosen to use a technique 
similar to that used in the w-projection algorithm to correct 
for the effects of non co-planar baselines ( Cornwell et al.|2003] l. 
As discussed in section 4.1 an approximately unitary operator 
can be constructed as the Fourier transform of Eq. |4j For our 

purpose, using Ey as the interpolation operator for gridding the 
visibilities on a regular grid and using FFT to invert the gridded 
visibilities would suffice. The accuracy of the forward calcula- 
tion is proportional to the accuracy of Ey which can, in principle, 
be arbitrarily precise (e.g. by accurate measurement of the an- 
tenna voltage pattern). An iterative deconvolution scheme using 
such transforms should ultimately drive the residual image to be 
noise-like, although it would seem desirable to limit the num- 
ber of free parameters introduced in the process. Note that since 
the final model image is iteratively built using accurate com- 
putations only in one direction, the intermediate residual dirty 
images have no physical meaning as is usually the case. 



4.1. Structure of the Sky Jones matrix 

J^^^ affects the measurements as described by Eq. [l] The for- 
ward and backward transforms discussed above crucially de- 

Skv 

pend on Ey being at least approximately unitary. Since My is 
an outer product of antenna based Jones matrices (Eq. [3| and 

Skv 

Ey = ;F7~(My ) where TT represents the element-by-element 
Fourier transform of its argument, for our purpose it is sufficient 
to ensure that the J^'^^ is approximately unitarjj^ The diago- 
nal terms of J^'^^ J^'^^ (of the form jP'') correspond 
to the ideal (un-squinted) power patterns and are nearly equal 
to each other. Fig. [3] shows the real and imaginary parts of the 
off-diagonal term normalized by det(^f^^y The peak amplitude 
is about two orders of magnitude lower than the diagonal term 
making J| approximately unitary. Image plane corrections there- 
fore can be incorporated as part of the image deconvolution pro- 
cedure by using Ey = 'FT [M^'^^j and E1 as part of the forward 



^ It can be shown that if J^^' is unitary, then so is J^'^'' 



.Sky 



and reverse transforms between the visibility and image domains 
for baseline /- j. 

4.2. The overall deconvolution algorithm 

Our deconvolution algorithm proceeds as follows: 

1 . Initialize: Set the initial model image to zero or to a model 
using apriori knowledge of the sky emission (for example a 
model obtained with conventional techniques). 

2. Major cycle: 

- Forward: Compute the residual visibilities y''*' - 
using the observed visibilities V'^*' for each polarization 
product. 

- Backward: Compute the residual image using Eqs. 11 
and[T2lbelow. 

3. Minor cycle: Update the model image applying some opera- 
tor T. 

4. Go to|2]until convergence is achieved, typically quantified by 
suitable stopping criteria (noise level, distribution of residu- 
als, etc.). 

5. Smooth the deconvolved image by the resolution element 
and add back the residuals. 



5. Antenna polarization and pointing error 
correction 

In the following section, we describe how the forward and back- 
ward calculations can account correctly for polarization leakage 
and pointing errors. 

5. 1. Forward calculation 

In the absence of antenna pointing errors, the operator E^" is the 
auto-correlation of the ideal antenna illumination patterns for 
polarization product P. It has finite support and is also approx- 
imately unitary. Therefore, it has the required properties to be 
used to realize the transforms necessary in a deconvolution algo- 
rithm to correct for primary beam effects. In the presence of an- 
tenna pointing errors, the operator E^" is different for each base- 



6 



S. Bhatnagar et al.: Correction of direction-dependent gains during image deconvolution 






.Hi _Tri — 



4CI JO ZO 10 -10 -ZO -30 -40 
Relative J2000 Right Ascension (aremin) 



+0 



30 



20 - 



10 - 



-1 



-20 



■"^ -30 I- 



-+0 



1 r 



1 I I r 



1 r 



J L 



+0 JO ZO 10 -10 -ZO -30 -40 
[Relative J2Ci00 Rigl^t Ascension (aremm) 



Fig. 4. The top row shows the Stokes-I images and the bottom row shows the Stokes-V images. The images on the left were made without squint 
and pointing correction while those on the right had both corrections applied. The deconvolution errors seen around the strongest sources are due 
to the antenna pointing errors and time varying direction dependent gain due to the rotation of azimuthally asymmetric antenna power patterns. 
These images were made using a linear transfer function with the gray scales in the range -20 yujy/beam (black) and +40 yujy/beam (white). The 
RMS noise in the off-source regions of the images in the left and the right panels is 10 /jjy/beam and 1 yuJy/beam respectively. 



line /- j. For small pointing errors compared to the half power 
beam width, pointing errors contribute a linear phase gradient 
across the aperture. Therefore, the full E? including antenna 
pointing errors can be efficiently evaluated by separating it into 
terms that include the effects which are equal for all antennas 
(e.g. the polarization squint of the VLA antennas) and effects 
which vary between antennas (e.g. the antenna-based pointing 
offsets) as: 

ep = E^°m - <i>y^'^^'^ (10) 

where 0, is the pointing offset for antenna / and E''° is the auto- 
correlation of the ideal antenna illumination pattern. The func- 
tion / represents the de-correlation that the signal suffers at each 
baseline due to the antenna pointing errors. This function is unity 



at the origin (/(O) - 1.0) and for voltage patterns with finite 
support, it will be a monotonically decreasing function of its ar- 
gument (for typical illumination functions) PI However, its exact 
form will depend upon the actual form of the voltage patterns. 
For small pointing errors (few percent of the half-power beam 
width), it will be close to unity to the first order. When using E^" 
as the visibility plane filter for baseline /- j, with no pointing er- 
rors the predicted visibilities will correspond to a sky tapered by 
the corresponding power pattern (as it should be). With pointing 
errors, the predicted visibilities will include the effects of point- 
ing errors. 

^ This might not be a monotonically decreasing function for all ways 
in which a feed might illuminate a secondary/primary reflector. 
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Table 1. Simulation parameters 



Range of Parallactic Angle(°) 1 83 - 45 

Number of visibilities 10^ 

Integration time(sec) 10 

Number of frequency channels 1 

Frequency(GHz) 1 .4 

Noise per sample(mjy) 1 

Max. baseline(Km) 3 

Number of antennas 27 



5.2. Backward calculation 

The backward calculation can be realized by using E^"* as the 

interpolation operator for re-sampling V^iujj, Vy) (the visibilities 
for the polarization product P) on a regular grid at pixels labeled 
by indices (n, m) as: 

V'^'^inAu, inAv) = (E^'v^{uij, Vy)) (nAu, inAv) (11) 

where the superscript G is used to indicate a regular grid. The 
images corresponding to the gridded visibilities are then com- 
puted as 



: det 



(fi[E-])- 



P,G 



(12) 



The net effect of Ey in the image domain (expressed by P' [E''' j 
in Eq. 12 above) is therefore averaged over all antennas for the 
entire range of parallactic angle coverage. 

6. Results 

6.1. Simulations 

The algorithm was tested for VLA squint and gain variations due 
to the rotation of azimuthally asymmetric antenna power pat- 
terns on the sky. 

The visibilities were simulated using the Cyi>Syj^ package 
with the parameters listed in Table [T| A model for typical sky 
emission at 1420 MHz was generated using the NVSS source 
list. The actual rendition has 74 point sources with flux densities 
ranging between 195 mJy and 2 mJy. A PA increment of 10° was 
used in order to simulate the rotation of the R- and L-beams on 
the sky. The visibilities were simulated for VLA C-array and an 
RMS noise of ~ 1 mJy per visibility sample was added to simu- 
late an image plane RMS noise of ~ 1 /iJy/beam. Without squint 
correction, the peak and RMS noise in the resultant Stokes-V im- 
age were ~ 2 mJy and ~ 10 /iJy/beam respectively. The Stokes- 
V image generated using the algorithm described in this paper 
was noise-like with an RMS ~ 1 //Jy/beam. The squint correc- 
tion results in an improvement of the noise figure by a factor 
of ~ 10. The visibilities with pointing errors were simulated 
by predicting the model visibilities using the forward transform 
with 



(section 



5.1 



Ey computed for increments of 10° in PA. A 
model for the VLA aperture illumination pattern ( |Brisken|2003| l 
was used to generate a non-azimuthally symmetric power pat- 
tern. The model includes the geometry of the sub-reflector and 
the feed position as well as the aperture blockage due to the feed 
legs and the sub-reflector The averages of the pointing errors for 
each antenna were randomly distributed between +25" with an 
RMS of 5". The images were deconvolved using standard image 
deconvolution procedures and again using the above algorithm. 



The Stokes-I images are shown in Fig. |4] As expected, the de- 
convolution errors were maximal for the sources around the half- 
power point and in the first sidelobe of the power pattern. These 
errors were eliminated when the pointing and squint correction 
were applied during deconvolution. The bottom panels of Fig. |4] 
show the Stokes-V images without and with pointing and squint 
corrections. 



6.2. VLA 1.4 GHz data 

The algorithm was also tested for Stokes-I and -V imaging us- 
ing VLA 1.4 GHz observations of the superthin galaxy IC2233 
( Matthews & Uson|2008j . The field contains two strong sources 
(~ 854 mJy/beam and ~ 145 mJy/beam) on opposite sides of the 
pointing center, located at positions of ~ 75% and ~ 35% pri- 
mary beam response levels respectively. The observations were 
made in spectral mode with channels of width ~ 24 kHz for a to- 
tal of ~ 11.6 hours in 2 passes with well distributed uv-coverage. 
The line-free channels (11 from the second "IF pair") were used 
for the tests described here. The aperture illumination pattern for 
each antenna was assumed to be the same and computed using 
the model for VLA antennas (Brisken 2003). The aperture illu- 
minations were computed as a function of PA in increments of 
1°. The expected thermal noise for this data is ~ 0.13 mJy/beam. 
The results of the imaging run with and without the correction 
for time-varying primary beam gains and polarization squint are 
shown in Figs. [5] and [6] The peak negative and positive resid- 
ual in the Stokes-V images without primary beam correction is 
-9.7 mJy/beam and 2.8 mJy/beam respectively. After the pri- 
mary beam correction, the peaks were +0.5 mJy/beam and un- 
correlated with location of the bright sources, with an RMS noise 
of 0.15 m Jy/beam. 



7. Error analysis 

Convergence of the deconvolution iterations is judged by the 
statistics in the residual image at convergence. Errors in the final 
residual image purely due to primary beam (PB) effects (within 
the main-lobe) can be expressed as: 



/* = 2P5F(.A)*[APB(^)r] 



(13) 
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where '★' represents the convolution operator, ij/ is the feed 
Parallactic Angle, APB{i^) is the error between the true and the 
assumed primary beam model at PA= tfr, 1° is the true sky distri- 
bution and PSF(t//) is the instantaneous (snapshot) PSF When 
imaging using an azimuthally symmetric PB model, the PB er- 
ror pattern is given by APB(t(r) = PB - PB(t//) where PB is the 
azimuthally averaged PB. Rotation of this error pattern on the 
sky contributes the dominant systematic errors in the residual 
image (and consequently in the final deconvolved image). The 
peak residual can be estimated for a point source of flux density 
S located at the position of the peak of the error pattern mul- 
tiplied by the maximum sidelobe of the instantaneous PSF at 
PA=(A,: 

^"L. = [ F,u,,.,i,,m„aJ [ ^PmdLa.] S (14) 

Instantaneous Stokes-I and -V VLA antenna power patterns 
at 1.4 GHz are shown in Fig. |7] The patterns rotate on the sky 
with PA which results in time-varying, position-dependent gain 
across the field of view. The APB and an azimuthal cut through 
this error pattern at 50%, 10% and 1% point of PB are shown in 
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Fig. 5. The Stolces-I images at 1.4 GHz witii VLA C-array observation. Tlie left panel shows the deconvolved image without corrections of the 
antenna power pattern variations as a function of parallactic angle. The right panel shows the result from the algorithm described in this paper 
The two dominant sources, on either side of the pointing center have flux densities of ~ 854 mjy/beam and ~ 145 mjy/beam and the RMS noise 
is 0.15 mJy/beam. A linear transfer function with a range ±1.5 mJy/beam was used to make these images. 
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Fig. 6. The Stokes-V images for a 1.4 GHz VLA C-array data. The left and right panels show the images without and with PB-corrections using 
a linear transfer function with a range ±1.5 mJy/beam was used to make these images. Errors due to the polarization squint are greater than 
thermal noise in the left image. The RMS in the right-hand-side image is ~ 0.15 mJy/beam. The negative and positive peaks in this image are 
±0.5 mJy/beam. 



Figs.[8jand |9jrespectively. The contours correspond to PB„,a^ x 
[0.01, 0.05, 0.07, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]. Within 
the main-lobe of the Stokes-I beam, the peak of the error 
pattern is where PB is between 1 - 10%. For 5 = 1 Jy, 
PSF.iMohei^f'oXnax = 40% (measured for the test 1.4 GHz, C- 
array data) and APB(4ro)\,„ax - 0.005, the peak residual will be 
about 2 mJy in Stokes-I. Peak residuals in Stokes-V would be at 
the level of about 10 mJy. 



The deconvolution algorithm described above consists es- 
sentially of approximating the function shown in Fig. |9] (or 
equivalently the 2D function shown in Fig. [T]) by a piece-wise 
constant function. The maximum error due to such an approxi- 
mation can be estimated using the following equation: 

IL. - A/i,[A,A] (15) 

R r n dPB 

where = S [PS F sidelobe\max\ 




Fig. 7. Model for the VLA 1.4 GHz antenna at Parallactic Angle ~ 80°. 
The Stokes- V pattern ([PBn]; - PBii]/2) is shown in colour/gray scale 
with the contours of the Stokes-I power pattern superimposed. Dark 
regions in the gray scale image represent negative values due to the 
polarization squint of VLA antenna. 
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Fig. 8. Difference between an instantaneous Stokes-I PB and an az- 
imuthally averaged PB {PB). The colour/gray scale image is APB = 
PB(il/o) - PB while the contours are for the PB. 



For a required RMS noise in the image of rj, the peak error due 
to the piece-wise constant approximation should be 3-5 times 
smaller than rj. The minimum PA increment such that the RMS 
noise in the image is not limited by the piece-wise constant ap- 
proximation would be given by 



< 377/A/i 



(16) 



For the 1.4 GHz, VLA C-aiTay test data, 
^1 ~ 0.0003 deg-\ For a PA increment of 10°, 
peak residuals in Stokes-I and -V would be about 1 mJy and 



5 mJy respectively. With the expected thermal sensitivity of 
0. 1 mJy/beam for the 1 .4 GHz test data we used, PA increments 
of 1 ° were required. 

The PA increment for higher sensitivity telescopes like the 
EVLA or SKA such that imaging is not limited by the above 
approximation will be much smaller This requirement how- 
ever can be significantly relaxed by approximating the eiTor 
function by a piecewise linear approximation (interpolation of 
the functions computed at larger PA increments). Furthermore, 
since image interpolation itself can be expensive, caching of pre- 
computed aperture functions at appropriate PA increments will 
be necessary. Note that the gridding cost is relatively insensitive 
to the number of convolution functions used. A hybrid approach 
of FFT based transforms plus analytical computations for the 
strongest sources will probably deliver optimal performance. 

8. Discussion 

Some residual deconvolution errors are still left around the sec- 
ond strongest source in Fig. [5] The pattern in the residual image 
(not shown) suggests that these errors are due to image pixela- 



tion ( |Voronkov & Wieringa|2004l [Cotton & Uson|2007| ). More 
sophisticated parametrization of the sky, independent of the im- 
age pixel size (e.g. as is done in scale sensitive deconvolution 
algorithms like the Asp-Clean (jBhatnagar & Cornwell 2004) or 
MS -Clean) along with the imaging algorithm described here to 
coiTect for DD gains should give better results. It is also possible 
that the residual eiTors are due to pointing eiTors during the ob- 
servation. We are investigating this possibility and hope to report 
on it in due course. 

8.1. Why not Peeling? 

The algorithm described here corrects for DD gains without 
loosing the efficiency advantage of the FFT algorithm. Our al- 
gorithm scales well in run-time efficiency and implementation 
complexity for large data volume, complex field as well as for 
arrays where antenna elements cannot be assumed to be identi- 
cal. 

Various variants of the "Peeling" algorithm can also be used 
to correct for direction-dependent gains. In this approach an- 
tenna based gains are determined in the direction of each com- 
pact source. These gains are then used to subtract the contri- 
bution of compact sources from the observed data using a Direct 
Fourier Transform (DFT) and the residual visibilities are imaged 
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again. While this is useful in removing the artifacts due to strong 
compact sources, since the gains are determined independently 
for each direction in the sky, as the image complexity increases, 
too many degrees of freedom (DoF) might be added to the prob- 
lem. For crowded fields (large number of compact sources), this 
leads to a proliferation of DoFs and potentially to the problem 
of over-fitting (the extreme case being when each pixel in the 
image has an associated independent gain which gives the best- 
fit result). Since DFTs have to be used to compute residuals, 
the computing load is also significantly higher than the corre- 
sponding one for FFT-based computation of residuals. For com- 
plex fields containing extended emission, this approach quickly 
becomes numerically un-viable because of the large number of 
DoFs included as well as the high computing and I/O loads in- 
volved. Therefore, while variants of the Peeling algorithm could 
have given better results for the particular 1 .4 GHz VLA data 
that we have used in this paper, we did not resort to Peeling 
based algorithms. However since the goal here is to demonstrate 
the effectiveness of the algorithm in correcting for otherwise dif- 
ficult to correct DD gains, we used this relatively simple field for 
our tests so that the advantages and limitations of our algorithm 
are brought to the fore. 

Of course, using a direct Fourier transform (DFT) for pre- 
dicting model visibilities rather than using the FFT algorithm 
will give the most accurate results. While such a brute-force ap- 
proach might be useful for simple fields, as mentioned above, the 
computing cost for even such simple fields becomes prohibitive 
for data with more than a few frequency channels, even when 
assuming that the various antenna elements are identical. For 
cases where this assumption breaks down, as it does even for the 
simple case of random antenna pointing errors, computing costs 
are impractically high. Furthermore, for full-beam, full-Stokes 
imaging, which requires use of at least the diagonal terms of the 
Mueller matrix (Eq. [4]i if not the full matrix, it is unclear if a 
brute-force DFT approach will work. 

8.2. Implications for wideband and mosaic imaging 

Antenna pointing errors, azimuthally asymmetric aperture illu- 
minations, wide bandwidths and deconvolution errors due to the 
use of discrete pixels for the sky representation all leave residu- 
als that limit the full-beam imaging dynamic range to lO'* - 10^. 
Therefore, apart from correcting for the direction dependent ef- 
fects, for the highest imaging dynamic range, scale-sensitive de- 
composition of the sky might also be necessary ( [Bhatnagar & 
|Cornwell|2004| ). The algorithm described here can be combined 
efficiently with scale-sensitive deconvolution and has the poten- 
tial of overcoming the above mentioned imaging dynamic range 
limit. 

The algorithm described here accounts for the time varying 
gain variations due to the rotation of the azimuthally asymmetric 
aperture illumination with PA. For imaging with a large band- 
width ratio (e.g. the ratio of frequencies at the two edges of the 
observing band for EVLA will be 2:1), the dominant error term 
will be the scaling of the power pattern with frequency. Sources 
which will be well within the main lobe of the primary beam 
at the lower frequency end of the band will be outside the main 
lobe at the higher frequency end (and may even appear in the 
first sidelobe). Since the azimuthal variations in the power pat- 
tern due to feed-leg/sub-reflector blockage are maximal close to 
the null and in the first sidelobe, frequency scaling of the aper- 
ture illumination will contribute a first-order error. 

Scaling of the antenna power patterns with frequency in ob- 
servations with wide bandwidths can be incorporated in the al- 



gorithm described here by computing the aperture illumination 
functions at appropriate increments in frequency. Alternatively, 
depending on the required accuracy, this scaling can be achieved 
as well by scaling the co-ordinates with frequency. Since the 
computing cost scales weakly with the number of convolution 
functions used, the extra computing load will not be too high. 
The cost of computing the aperture functions itself will be sig- 
nificant, but it is a one-time cost. 

Rotation of the sidelobes results in gain variations on the or- 
der of a factor of two in the direction of the sidelobes. For mo- 
saic observations, this will contribute significant time-varying 
flux density in most individual pointings; assuming a peak PSF 
sidelobe of 10%, the error inside the main lobe of the power pat- 
tern will be at the level of a few percent of the peak flux in the 
direction of the first sidelobe of the antenna power pattern. This 
will limit the mosaicking dynamic range significantly, indeed it 
will be a first-order effect. In addition, a second-order effect will 
be due to antenna pointing errors. Correction of both of these 
effects will be required for mosaicking instruments presently 
under construction like the ALMA and the ASKAP Pohnston] 
2007 ). The general framework and the algorithm described here 
can be generalized easily for application to mosaic imaging and 
could correct errors due to the rotation of antenna primary beams 
as well as pointing errors. In practice however, the imaging dy- 
namic range might be limited by the precision with which an- 
tenna power patterns and pointing errors can be determined. 
Furthermore, a similar approach can be used for imaging with 
inhomogeneous arrays like CARMA/ALMA (where not all an- 
tennas in the array are identical) as well as with arrays with 
multi-feed antennas like the ASKAP. Finally, for very high dy- 
namic range imaging with telescopes like LOFAR, SKA, and 
even EVLA, nominally identical antenna elements may have 
variations that will induce errors higher than the thermal noise 
limit. In that sense, such telescopes will also need to be treated 
as inhomogeneous arrays. 

9. Conclusions 

Existing imaging algorithms ignore the effects of time varying 
gains due to antenna pointing errors and rotation of azimuthally 
asymmetric antenna power patterns. As shown in sections |6]and 
|7] using the VLA as an example, residual errors due to these ef- 
fects are maximal in the first sidelobe and significant even within 
the main lobe of the antenna power pattern. Simulations show 
that the errors due to these effects limit the achievable dynamic 
range for sensitive radio interferometers under construction like 
the EVLA, ALMA and ASKAP. The full polarimetric response 
of the antenna is also inherently asymmetric due to the physics of 
reflection from curved surfaces. In addition, the voltage patterns 
of the two orthogonal polarizations for the VLA antennas are 
separated on the sky (polarization squint) resulting in increasing 
instrumental Stokes-V as a function of distance from the im- 
age center. Therefore, for moderate dynamic range full-beam, 
full-Stokes imaging, the Sky Jones matrix for the VLA antennas 
cannot be assumed to be scaled-identity or even diagonal. Even 
for antennas without polarization squint, the off-diagonal terms 
will remain significant, even though the difference between the 
parallel-hand terms may be negligible. Hence, full-Sky Jones 
matrix treatment is necessary for full Stokes imaging of most 
observed fields. This implies a four-fold increase in the comput- 
ing load when compared to imaging when primary beam effects 
are neglected. 

The deconvolution algorithm described in this paper corrects 
for systematic effects due to non-ideal primary beams by mod- 
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eling the complex antenna aperture illumination for the two or- 
thogonal polarizations as a function of parallactic angle and an- 
tenna pointing errors. The antenna aperture functions are used to 
construct precise forward and approximate inverse transforms, 
exploiting the property that the Sky Jones matrix is approxi- 
mately unitary. We have applied this algorithm to VLA 1 .4 GHz 
imaging and show that the instrumental Stokes-V is eUminated 
to an accuracy of better than 10%, possibly limited by uncertain- 
ties in our model of the primary beam as well as pointing errors 
that were not corrected in this reduction because our algorithm 
does not yet handle pointing errors and selfcalibration simulta- 
neously. Simulations of single pointing observations at 1 .4 GHz 
with the EVLA with typical time-varying antenna pointing er- 
rors show that antenna pointing errors limit the imaging dynamic 
range at a level of ~ 10^ : 1 away from strong sources in a typ- 
ical field (imaging dynamic range could be even lower if you 
are unlucky, like for the IC2233 case). Using this algorithm on 
this simulated data we demonstrate that the effects of antenna 
pointing errors can also be corrected during deconvolution. This 
approach can therefore be used for full-beam full-Stokes imag- 
ing. 

Finally, we note that in the presence of image plane er- 
rors, imaging and cahbration algorithms are more tightly cou- 
pled compared to those appropriate to direction-independent cal- 
ibration. Solvers for parameters which describe direction depen- 
dent errors require the forward transform used during imaging. 
Correction for direction dependent effects is done during im- 
age deconvolution and one cannot produce corrected visibih- 
ties independent of full image deconvolution. With the advent of 
higher sensitivity arrays where many direction dependent errors 
will need to be accounted for, modem imaging and calibration 
software must be designed to easily accommodate these cases. 
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